Researchers: Jean-Claude Bradley, William E Acree Jr., and Andrew SID Lang
All content, models and data are released as CC0 - the default license for all our ONS work.
This page is a duplicate (backup) of the original ONSChallenge page AbrahamSolventModel004
Objective
To investigate the predicted solvation properties of a comprehensive list of sustainable solvents.[1] These solvents are compared with solvents that have known Abraham solvent coefficients, with two aims: to identify safer, sustainable replacements for existing solvents, and to find potential new safe solvents whose solvation properties can be approximated through their predicted solvent coefficients.
Background
The Abraham general solvation model uses the LFER
log P = c + e E + s S + a A + b B + v V
where c, e, s, a, b, v are the solvent coefficients and E, S, A, B, V are the solute descriptors; see this brief discussion of the model. The Abraham coefficients are found via linear regression on measured data. The standard procedure is to let the c-coefficient (the intercept) float in the regression. It has been suggested that c should not be negative.[2] We suggest that little predictive ability is lost if c is simply required to be zero, which also makes solvents easier to compare. Thus, in order to compare current solvents with each other and potential new solvents with current solvents, we re-calculated the coefficients for known solvents (e_0, s_0, a_0, b_0, v_0) with c set to zero; see Solvents Model 003.
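As an illustration of what requiring c to be zero means in practice, the sketch below fits the LFER twice in base R: once with a floating intercept and once with the intercept removed. The solute descriptors and coefficients are synthetic, noise-free values made up for this example; they are not data from Solvents Model 003, and the variable names are assumptions.

```r
# Synthetic illustration only: descriptors and coefficients are made up.
set.seed(1)
n <- 30
solutes <- data.frame(
  E = runif(n, 0, 2),
  S = runif(n, 0, 2),
  A = runif(n, 0, 1),
  B = runif(n, 0, 1),
  V = runif(n, 0.5, 2)
)
true_coef <- c(E = 0.4, S = -1.0, A = 0.1, B = -4.0, V = 4.1)

# Noise-free log P generated with a zero intercept
solutes$logP <- as.vector(as.matrix(solutes[, names(true_coef)]) %*% true_coef)

# Standard procedure: the intercept c is allowed to float
fit_float <- lm(logP ~ E + S + A + B + V, data = solutes)

# Constrained fit: "0 +" drops the intercept, i.e. forces c = 0
fit_zero <- lm(logP ~ 0 + E + S + A + B + V, data = solutes)

coef(fit_zero)
```

With real measured data the two fits would generally give slightly different e, s, a, b, v values, which is why the coefficients were re-calculated rather than reused.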
Then the models themselves can be applied to new compounds to get predicted solvent coefficients which in turn can be used to predict log P for compounds with known Abraham descriptors using the following equation:
log P = e_0 E + s_0 S + a_0 A + b_0 B + v_0 V
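A minimal sketch of this zero-intercept equation in R is shown below. The coefficients are the e_0 to v_0 values listed on this page for 1-butanol, and the descriptors are the benzoic acid values quoted later on this page; the function name `predict_logP` is a hypothetical helper introduced only for this example.

```r
# Zero-intercept Abraham equation:
#   log P = e0*E + s0*S + a0*A + b0*B + v0*V
predict_logP <- function(coefs, descriptors) {
  sum(coefs[c("e", "s", "a", "b", "v")] *
        descriptors[c("E", "S", "A", "B", "V")])
}

# c = 0 coefficients for 1-butanol (from the table of known solvents)
butanol <- c(e = 0.387596, s = -0.97209, a = 0.108258,
             b = -3.97885, v = 4.12895)

# Abraham descriptors for benzoic acid (as quoted on this page)
benzoic_acid <- c(E = 0.730, S = 0.90, A = 0.59, B = 0.40, V = 0.9317)

logP <- predict_logP(butanol, benzoic_acid)
print(round(logP, 3))
```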
Procedure
The supplementary data from the paper by Moity et al. consists of two files containing lists of solvents with names and CAS numbers: one of classical organic solvents and one of green solvents. Each file was downloaded, and ChemSpider IDs (CSIDs) and structures (SMILES) were added by cross-referencing the names and CAS numbers on ChemSpider. One row (menthanyl acetate) listed a CAS number for a similar but different compound (menthyl acetate); both compounds were kept. CDK descriptors were calculated for all solvents using Rajarshi Guha's CDK DESCUI (v 1.4.2). The option to "Add explicit H" was selected and the descriptors were output as comma-delimited csv files. These files were then loaded into R (R i386 3.0.0) to calculate the predicted solvent coefficients using the following code:
library("randomForest") # for modeling (randomForest 4.6-7)
setwd(".../SustainableSolvents")
## load the CDK descriptor tables; row names come from the "Title" column
myclassicdata <- read.csv(file="classicdescriptors.csv", header=TRUE, row.names="Title")
mygreendata <- read.csv(file="greendescriptors.csv", header=TRUE, row.names="Title")
## load the model for the 'e' coefficient
mydata.rf <- readRDS("erfmodel")
## predict using the random forest model
test.predict <- predict(mydata.rf,myclassicdata)
## write the predictions to the working directory
write.csv(test.predict, file = "RFTestPredictClassice.csv")
## predict using the random forest model
test.predict <- predict(mydata.rf,mygreendata)
## write the predictions to the working directory
write.csv(test.predict, file = "RFTestPredictGreene.csv")
## Similarly for the other coefficients s,a,b, and v
mydata.rf <- readRDS("srfmodel")
test.predict <- predict(mydata.rf,myclassicdata)
write.csv(test.predict, file = "RFTestPredictClassics.csv")
test.predict <- predict(mydata.rf,mygreendata)
write.csv(test.predict, file = "RFTestPredictGreens.csv")
mydata.rf <- readRDS("arfmodel")
test.predict <- predict(mydata.rf,myclassicdata)
write.csv(test.predict, file = "RFTestPredictClassica.csv")
test.predict <- predict(mydata.rf,mygreendata)
write.csv(test.predict, file = "RFTestPredictGreena.csv")
mydata.rf <- readRDS("brfmodel")
test.predict <- predict(mydata.rf,myclassicdata)
write.csv(test.predict, file = "RFTestPredictClassicb.csv")
test.predict <- predict(mydata.rf,mygreendata)
write.csv(test.predict, file = "RFTestPredictGreenb.csv")
mydata.rf <- readRDS("vrfmodel")
test.predict <- predict(mydata.rf,myclassicdata)
write.csv(test.predict, file = "RFTestPredictClassicv.csv")
test.predict <- predict(mydata.rf,mygreendata)
write.csv(test.predict, file = "RFTestPredictGreenv.csv")
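The five near-identical blocks above could also be written as a loop. The following is only a sketch of that refactoring, not the code that was actually run: `output_name` and `predict_all` are hypothetical helpers, and the sketch assumes the saved model files (`erfmodel`, `srfmodel`, and so on) sit in the working directory exactly as in the script above.

```r
# Hypothetical helper: reproduces the output file names used above,
# e.g. label "Classic" and coefficient "e" -> "RFTestPredictClassice.csv"
output_name <- function(label, coef) {
  paste0("RFTestPredict", label, coef, ".csv")
}

# Hypothetical helper: descriptor_sets is a named list of data frames,
# e.g. list(Classic = myclassicdata, Green = mygreendata)
predict_all <- function(descriptor_sets, coefs = c("e", "s", "a", "b", "v")) {
  library("randomForest")  # needed so predict() dispatches on the saved models
  for (coef in coefs) {
    model <- readRDS(paste0(coef, "rfmodel"))
    for (label in names(descriptor_sets)) {
      pred <- predict(model, descriptor_sets[[label]])
      write.csv(pred, file = output_name(label, coef))
    }
  }
  invisible(NULL)
}

# Usage (requires the model and descriptor files to be present):
# predict_all(list(Classic = myclassicdata, Green = mygreendata))
```

Each saved model is read once and applied to both descriptor tables, which mirrors what the original script does coefficient by coefficient.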
Results
The results are listed in the tables below.
Solvents With Known Coefficients
Note the new (updated) coefficients for tributyl phosphate (as compared to the values listed in model003).
| c | e | s | a | b | v | solvent | e_0 | s_0 | a_0 | b_0 | v_0 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.17 | 0.4 | -1.01 | 0.06 | -3.96 | 4.04 | 1-butanol | 0.387596 | -0.97209 | 0.108258 | -3.97885 | 4.12895 |
| 0.22 | 0.27 | -0.57 | -2.92 | -4.88 | 4.46 | 1-chlorobutane | 0.254833 | -0.516892 | -2.847674 | -4.910816 | 4.57048 |
| -0.06 | 0.62 | -1.32 | 0.03 | -4.15 | 4.28 | 1-decanol | 0.6203042 | -1.3327395 | 0.0090799 | -4.1464701 | 4.2497972 |
| 0.04 | 0.4 | -1.06 | 0 | -4.34 | 4.32 | 1-heptanol | 0.3948325 | -1.0545913 | 0.0143256 | -4.3468223 | 4.3352649 |
| 0.12 | 0.71 | -1.62 | -3.18 | -4.8 | 4.32 | 1-hexadecene | 0.696611 | -1.588924 | -3.143734 | -4.810655 | 4.38202 |
| 0.12 | 0.49 | -1.16 | 0.05 | -3.98 | 4.13 | 1-hexanol | 0.482763 | -1.137261 | 0.090893 | -3.993082 | 4.190662 |
| -0.03 | 0.49 | -1.04 | -0.02 | -4.24 | 4.22 | 1-octanol | 0.4909845 | -1.0517092 | -0.0336817 | -4.2309735 | 4.2009611 |
| 0.15 | 0.54 | -1.23 | 0.14 | -3.86 | 4.08 | 1-pentanol | 0.52385 | -1.193867 | 0.188379 | -3.88273 | 4.154244 |
| 0.14 | 0.41 | -1.03 | 0.25 | -3.77 | 3.99 | 1-propanol | 0.393466 | -0.996062 | 0.291203 | -3.784515 | 4.05757 |
| 0.18 | 0.29 | -0.13 | -2.8 | -4.29 | 4.18 | 1,2-dichloroethane | 0.278894 | -0.090721 | -2.742532 | -4.313978 | 4.274203 |
| 0.12 | 0.35 | -0.03 | -0.58 | -4.81 | 4.11 | 1,4-dioxane | 0.33675 | -0.003994 | -0.542209 | -4.825876 | 4.173499 |
| 0.1 | 0.62 | -1.8 | -3.07 | -4.29 | 4.52 | 1,9-decadiene | 0.606347 | -1.771424 | -3.037034 | -4.304097 | 4.571908 |
| 0.13 | 0.25 | -0.98 | 0.16 | -3.88 | 4.11 | 2-butanol | 0.242337 | -0.945988 | 0.198721 | -3.897972 | 4.179391 |
| 0.19 | 0.35 | -1.13 | 0.02 | -3.57 | 3.97 | 2-methyl-1-propanol | 0.338508 | -1.082867 | 0.07594 | -3.591543 | 4.064762 |
| 0.21 | 0.17 | -0.95 | 0.33 | -4.09 | 4.11 | 2-methyl-2-propanol | 0.153709 | -0.897144 | 0.397899 | -4.111567 | 4.217508 |
| 0.12 | 0.46 | -1.33 | 0.21 | -3.75 | 4.2 | 2-pentanol | 0.445389 | -1.303994 | 0.243289 | -3.759434 | 4.260301 |
| 0.1 | 0.34 | -1.05 | 0.41 | -3.83 | 4.03 | 2-propanol | 0.334933 | -1.025916 | 0.437909 | -3.839249 | 4.084005 |
| 0.32 | 0.51 | -1.69 | -3.69 | -4.81 | 4.4 | 2,2,4-trimethylpentane | 0.485433 | -1.610035 | -3.586041 | -4.850718 | 4.563507 |
| 0.07 | 0.36 | -1.27 | 0.09 | -3.77 | 4.4 | 3-methyl-1-butanol | 0.35352 | -1.255543 | 0.11342 | -3.779382 | 4.43679 |
| 0.31 | 0.31 | -0.12 | -0.61 | -4.75 | 3.94 | acetone | 0.286706 | -0.04747 | -0.508846 | -4.792269 | 4.102844 |
| 0.41 | 0.08 | 0.33 | -1.57 | -4.39 | 3.36 | acetonitrile | 0.044129 | 0.423135 | -1.4362 | -4.443325 | 3.576285 |
| 0.14 | 0.46 | -0.59 | -3.01 | -4.63 | 4.49 | benzene | 0.452175 | -0.554143 | -2.963555 | -4.643338 | 4.564318 |
| 0.1 | 0.29 | 0.06 | -1.61 | -4.56 | 4.03 | benzonitrile | 0.277045 | 0.081936 | -1.574291 | -4.574904 | 4.07839 |
| -0.02 | 0.44 | -0.42 | -3.17 | -4.56 | 4.45 | bromobenzene | 0.4369346 | -0.4279276 | -3.1781219 | -4.5563083 | 4.4367652 |
| 0.25 | 0.26 | -0.08 | -0.77 | -4.86 | 4.15 | butanone | 0.236179 | -0.022077 | -0.68909 | -4.886268 | 4.274532 |
| 0.25 | 0.36 | -0.5 | -0.87 | -4.97 | 4.28 | butyl acetate | 0.336024 | -0.442788 | -0.788152 | -5.004535 | 4.408761 |
| 0.05 | 0.69 | -0.94 | -3.6 | -5.82 | 4.92 | carbon disulfide | 0.6819348 | -0.9318396 | -3.5870443 | -5.8248542 | 4.9458403 |
| 0.2 | 0.52 | -1.16 | -3.56 | -4.59 | 4.62 | carbon tetrachloride | 0.506806 | -1.112282 | -3.496545 | -4.619008 | 4.720644 |
| 0.07 | 0.38 | -0.52 | -3.18 | -4.7 | 4.61 | chlorobenzene | 0.3752336 | -0.5056045 | -3.1613969 | -4.7083071 | 4.6478423 |
| 0.19 | 0.11 | -0.4 | -3.11 | -3.51 | 4.4 | chloroform | 0.089413 | -0.357874 | -3.051291 | -3.537934 | 4.493193 |
| 0.16 | 0.78 | -1.68 | -3.74 | -4.93 | 4.58 | cyclohexane | 0.770769 | -1.640414 | -3.689346 | -4.948869 | 4.659072 |
| 0.04 | 0.23 | 0.06 | -0.98 | -4.84 | 4.32 | cyclohexanone | 0.2216766 | 0.0668337 | -0.962833 | -4.8469389 | 4.3348761 |
| 0.19 | 0.72 | -1.74 | -3.45 | -4.97 | 4.48 | decane | 0.706653 | -1.697274 | -3.389851 | -4.993254 | 4.571974 |
| 0.18 | 0.39 | -0.99 | -1.41 | -5.36 | 4.52 | dibutyl ether | 0.37973 | -0.94367 | -1.357836 | -5.379393 | 4.614845 |
| 0.33 | 0.3 | -0.44 | 0.36 | -4.9 | 3.95 | dibutylformamide | 0.275223 | -0.3577 | 0.462459 | -4.944045 | 4.122737 |
| 0.32 | 0.1 | -0.19 | -3.06 | -4.09 | 4.32 | dichloromethane | 0.076325 | -0.111839 | -2.957248 | -4.130085 | 4.487963 |
| 0.35 | 0.36 | -0.82 | -0.59 | -4.96 | 4.35 | diethyl ether | 0.329941 | -0.737491 | -0.477868 | -5.000297 | 4.530001 |
| 0.21 | 0.03 | 0.09 | 1.34 | -5.08 | 4.09 | diethylacetamide | 0.0167 | 0.139253 | 1.409338 | -5.111165 | 4.197615 |
| -0.27 | 0.08 | 0.21 | 0.92 | -5 | 4.56 | dimethylacetamide | 0.104844 | 0.145499 | 0.831639 | -4.970213 | 4.418588 |
| -0.31 | -0.06 | 0.34 | 0.36 | -4.87 | 4.49 | DMF | -0.034208 | 0.27126 | 0.264139 | -4.827615 | 4.330082 |
| -0.19 | 0.33 | 0.79 | 1.26 | -4.54 | 3.36 | DMSO | 0.341749 | 0.745718 | 1.200253 | -4.516908 | 3.262081 |
| 0.11 | 0.67 | -1.64 | -3.55 | -5.01 | 4.46 | dodecane | 0.658583 | -1.616828 | -3.508611 | -5.020509 | 4.517868 |
| 0.22 | 0.47 | -1.04 | 0.33 | -3.6 | 3.86 | ethanol | 0.453079 | -0.983328 | 0.396198 | -3.623493 | 3.971272 |
| -0.17 | -0.02 | 0 | 0.07 | -0.37 | 0.45 | ethanol/water(10:90)vol | -0.009316 | -0.04156 | 0.010626 | -0.350331 | 0.365271 |
| -0.25 | 0.04 | -0.04 | 0.1 | -0.83 | 0.92 | ethanol/water(20:80)vol | 0.062777 | -0.098993 | 0.01722 | -0.800521 | 0.786631 |
| -0.27 | 0.11 | -0.1 | 0.13 | -1.32 | 1.41 | ethanol/water(30:70)vol | 0.127804 | -0.160996 | 0.049426 | -1.282852 | 1.276293 |
| -0.22 | 0.13 | -0.16 | 0.17 | -1.81 | 1.92 | ethanol/water(40:60)vol | 0.148332 | -0.210817 | 0.102648 | -1.781936 | 1.804907 |
| -0.14 | 0.12 | -0.25 | 0.25 | -2.28 | 2.42 | ethanol/water(50:50)vol | 0.134901 | -0.285203 | 0.207128 | -2.257463 | 2.342294 |
| -0.04 | 0.14 | -0.34 | 0.29 | -2.68 | 2.81 | ethanol/water(60:40)vol | 0.1406465 | -0.3442878 | 0.281256 | -2.6697239 | 2.7916452 |
| 0.06 | 0.09 | -0.37 | 0.31 | -2.94 | 3.1 | ethanol/water(70:30)vol | 0.0794107 | -0.3529953 | 0.331463 | -2.9438845 | 3.1344568 |
| 0.17 | 0.18 | -0.47 | 0.26 | -3.21 | 3.32 | ethanol/water(80:20)vol | 0.161026 | -0.424495 | 0.314211 | -3.233463 | 3.411426 |
| 0.24 | 0.21 | -0.58 | 0.26 | -3.45 | 3.55 | ethanol/water(90:10)vol | 0.193477 | -0.51767 | 0.338521 | -3.480739 | 3.669878 |
| 0.33 | 0.37 | -0.45 | -0.7 | -4.9 | 4.15 | ethyl acetate | 0.342809 | -0.369036 | -0.596948 | -4.94523 | 4.318697 |
| 0.09 | 0.47 | -0.72 | -3 | -4.84 | 4.51 | ethylbenzene | 0.459437 | -0.701228 | -2.970828 | -4.855741 | 4.56218 |
| -0.27 | 0.58 | -0.51 | 0.72 | -2.62 | 2.73 | ethylene glycol | 0.599449 | -0.574819 | 0.631321 | -2.585314 | 2.5908 |
| 0.14 | 0.15 | -0.37 | -3.03 | -4.6 | 4.54 | fluorobenzene | 0.140337 | -0.340978 | -2.985464 | -4.618238 | 4.611483 |
| -0.17 | 0.07 | 0.31 | 0.59 | -3.15 | 2.43 | formamide | 0.083307 | 0.267828 | 0.536608 | -3.131516 | 2.344771 |
| 0.3 | 0.64 | -1.76 | -3.57 | -4.95 | 4.49 | heptane | 0.61919 | -1.685129 | -3.477189 | -4.983132 | 4.640776 |
| 0.09 | 0.67 | -1.62 | -3.59 | -4.87 | 4.43 | hexadecane | 0.659893 | -1.59632 | -3.559573 | -4.880281 | 4.47815 |
| 0.33 | 0.56 | -1.71 | -3.58 | -4.94 | 4.46 | hexane | 0.53342 | -1.631725 | -3.473425 | -4.980698 | 4.634317 |
| -0.19 | 0.3 | -0.31 | -3.21 | -4.65 | 4.59 | iodobenzene | 0.312539 | -0.352762 | -3.271785 | -4.629052 | 4.489752 |
| -0.61 | 0.93 | -1.15 | -1.68 | -4.09 | 4.25 | isopropyl myristate | 0.977259 | -1.294959 | -1.870114 | -4.017729 | 3.939081 |
| 0.12 | 0.38 | -0.6 | -2.98 | -4.96 | 4.54 | m-xylene | 0.366587 | -0.574078 | -2.941283 | -4.976929 | 4.598299 |
| 0.28 | 0.33 | -0.71 | 0.24 | -3.32 | 3.55 | methanol | 0.311909 | -0.649107 | 0.329542 | -3.354582 | 3.690751 |
| 0.35 | 0.22 | -0.15 | -1.04 | -4.53 | 3.97 | methyl acetate | 0.194997 | -0.067588 | -0.923983 | -4.571216 | 4.15239 |
| 0.34 | 0.31 | -0.82 | -0.62 | -5.1 | 4.43 | methyl tert-butyl ether | 0.279699 | -0.737134 | -0.510026 | -5.139775 | 4.600429 |
| 0.25 | 0.78 | -1.98 | -3.52 | -4.29 | 4.53 | methylcyclohexane | 0.762327 | -1.924196 | -3.439318 | -4.323834 | 4.654703 |
| 0.28 | 0.13 | -0.44 | 1.18 | -4.73 | 3.86 | N-ethylacetamide | 0.105071 | -0.374993 | 1.269385 | -4.764184 | 4.002187 |
| 0.22 | 0.03 | -0.17 | 0.94 | -4.59 | 3.73 | N-ethylformamide | 0.016449 | -0.114321 | 1.004651 | -4.616979 | 3.84333 |
| -0.03 | 0.7 | -0.06 | 0.01 | -4.09 | 3.41 | N-formylmorpholine | 0.6981457 | -0.0694897 | 0.0048883 | -4.0885654 | 3.38906 |
| 0.06 | 0.33 | 0.26 | 1.56 | -5.04 | 3.98 | N-methyl-2-piperidone | 0.3271873 | 0.2705115 | 1.5746338 | -5.0436057 | 4.0124292 |
| 0.09 | 0.21 | -0.17 | 1.31 | -4.59 | 3.83 | N-methylacetamide | 0.19721 | -0.150831 | 1.334533 | -4.600626 | 3.879615 |
| 0.11 | 0.41 | -0.29 | 0.54 | -4.09 | 3.47 | N-methylformamide | 0.397604 | -0.260136 | 0.578616 | -4.099689 | 3.529845 |
| 0.15 | 0.53 | 0.23 | 0.84 | -4.79 | 3.67 | N-methylpyrrolidinone | 0.519565 | 0.259902 | 0.887089 | -4.813222 | 3.749914 |
| -0.2 | 0.54 | 0.04 | -2.33 | -4.61 | 4.31 | nitrobenzene | 0.551741 | -0.003723 | -2.388352 | -4.584066 | 4.213974 |
| 0.02 | -0.09 | 0.79 | -1.46 | -4.36 | 3.46 | nitromethane | -0.0933342 | 0.7985957 | -1.4544755 | -4.3676129 | 3.4722537 |
| 0.24 | 0.62 | -1.71 | -3.53 | -4.92 | 4.48 | nonane | 0.599859 | -1.65665 | -3.456521 | -4.951366 | 4.605711 |
| 0.08 | 0.52 | -0.81 | -2.88 | -4.82 | 4.56 | o-xylene | 0.511059 | -0.793315 | -2.857401 | -4.831364 | 4.601817 |
| -0.1 | 0.15 | -0.84 | -0.44 | -4.04 | 4.13 | octadecanol | 0.155261 | -0.863525 | -0.466854 | -4.028093 | 4.075935 |
| 0.23 | 0.74 | -1.84 | -3.59 | -4.91 | 4.5 | octane | 0.719433 | -1.785636 | -3.512058 | -4.936095 | 4.620999 |
| 0.17 | 0.48 | -0.81 | -2.94 | -4.87 | 4.53 | p-xylene | 0.463092 | -0.772761 | -2.885801 | -4.895116 | 4.617725 |
| 0.37 | 0.39 | -1.57 | -3.54 | -5.22 | 4.51 | pentane | 0.35651 | -1.481294 | -3.418818 | -5.261024 | 4.703599 |
| 0 | 0.17 | 0.5 | -1.28 | -4.41 | 3.42 | propylene carbonate | 0.1672359 | 0.505135 | -1.2809844 | -4.4080414 | 3.4234811 |
| 0 | 0.15 | 0.6 | -0.38 | -4.54 | 3.29 | sulfolane | 0.1468503 | 0.6009136 | -0.3799049 | -4.541574 | 3.2903215 |
| 0.22 | 0.36 | -0.38 | -0.24 | -4.93 | 4.45 | THF | 0.345051 | -0.331628 | -0.167145 | -4.96046 | 4.564853 |
| 0.13 | 0.43 | -0.64 | -3 | -4.75 | 4.52 | toluene | 0.420597 | -0.614527 | -2.961869 | -4.763681 | 4.588524 |
| 0.02 | 0.35 | -0.43 | 0.71 | -4.73 | 4.19 | tributyl phosphate | 0.3482619 | -0.4268337 | 0.7148701 | -4.7277409 | 4.2032817 |
| 0.4 | -0.09 | -0.59 | -1.28 | -1.27 | 3.09 | trifluoroethanol | -0.125647 | -0.50143 | -1.155862 | -1.322677 | 3.290636 |
| 0.06 | 0.6 | -1.66 | -3.42 | -5.12 | 4.62 | undecane | 0.5979334 | -1.6471654 | -3.4017847 | -5.1276719 | 4.6493317 |
Predicted Coefficients e_0, s_0, a_0, b_0, and v_0 for classic solvents
| solvent | CAS | CSID | SMILES | e_0 | s_0 | a_0 | b_0 | v_0 |
|---|---|---|---|---|---|---|---|---|
| n-butylamine | 109-73-9 | 7716 | CCCCN | 0.401 | -0.861 | -0.757 | -4.572 | 4.266 |
| diethylamine | 109-89-7 | 7730 | CCNCC | 0.277 | -0.653 | -0.77 | -4.684 | 4.431 |
| n-propylamine | 107-10-8 | 7564 | CCCN | 0.394 | -0.809 | -0.664 | -4.412 | 4.159 |
| piperidine | 110-89-4 | 7791 | C1CCNCC1 | 0.413 | -0.703 | -0.934 | -4.664 | 4.434 |
| pyrrolidine | 123-75-1 | 29008 | C1CCNC1 | 0.375 | -0.555 | -0.812 | -4.653 | 4.387 |
| tributylamine | 102-82-9 | 7340 | CCCCN(CCCC)CCCC | 0.535 | -1.086 | -1.635 | -4.942 | 4.432 |
| triethylamine | 121-44-8 | 8158 | CCN(CC)CC | 0.367 | -0.545 | -1.349 | -4.943 | 4.504 |
| tert-butyl methyl ether | 1634-04-4 | 14672 | CC(C)(C)OC | 0.285 | -0.71 | -0.534 | -4.898 | 4.498 |
| acetone | 67-64-1 | 175 | CC(=O)C | 0.252 | -0.063 | -0.389 | -4.721 | 4.092 |
| DMEU | 80-73-9 | 6409 | CN1CCN(C1=O)C | 0.315 | 0.175 | 0.319 | -4.7 | 3.702 |
| DPMU | 7226-23-5 | 73671 | CN1CCCN(C1=O)C | 0.355 | 0.119 | 0.259 | -4.748 | 3.85 |
| N,N-dimethylacetamide | 127-19-5 | 29107 | CC(=O)N(C)C | 0.173 | 0.071 | 0.579 | -4.781 | 4.079 |
| N,N-dimethylformamide | 68-12-2 | 5993 | CN(C)C=O | 0.187 | 0.014 | 0.111 | -4.521 | 3.827 |
| 2,4-dimethylpyridine | 108-47-4 | 21132380 | CC1=CC(=NC=C1)C | 0.312 | -0.364 | -0.716 | -4.846 | 4.417 |
| 2,6-dimethylpyridine | 108-48-5 | 13842613 | CC1=NC(=CC=C1)C | 0.328 | -0.344 | -0.811 | -4.865 | 4.407 |
| ethylenediamine | 107-15-3 | 13835550 | C(CN)N | 0.439 | -0.604 | -0.369 | -4.133 | 3.538 |
| HMPTA | 680-31-9 | 12158 | CN(C)P(=O)(N(C)C)N(C)C | 0.289 | 0.052 | 0.2 | -4.41 | 3.645 |
| N-methyl-pyrrolidin-2-one | 872-50-4 | 12814 | CN1CCCC1=O | 0.408 | 0.195 | 0.217 | -4.68 | 3.827 |
| morpholine | 110-91-8 | 13837537 | C1COCCN1 | 0.324 | -0.346 | 0.053 | -4.308 | 3.938 |
| 3-picoline | 108-99-6 | 21106520 | CC1=CN=CC=C1 | 0.299 | -0.317 | -0.625 | -4.724 | 4.418 |
| 4-picoline | 108-89-4 | 13874733 | CC1=CC=NC=C1 | 0.314 | -0.35 | -0.641 | -4.716 | 4.46 |
| pyridine | 110-86-1 | 1020 | C1=CC=NC=C1 | 0.316 | -0.35 | -0.65 | -4.682 | 4.399 |
| quinoline | 91-22-5 | 6780 | C1=CC=C2C(=C1)C=CC=N2 | 0.391 | -0.343 | -1.184 | -4.643 | 4.38 |
| tetrahydrofuran | 109-99-9 | 7737 | C1CCOC1 | 0.344 | -0.346 | -0.369 | -4.87 | 4.465 |
| 1,1,3,3-tetramethyl urea | 632-22-4 | 11930 | CN(C)C(=O)N(C)C | 0.193 | 0.118 | 0.34 | -4.732 | 3.914 |
| 2,4,6-trimethylpyridine | 108-75-8 | 21106174 | CC1=CC(=NC(=C1)C)C | 0.331 | -0.427 | -0.931 | -4.827 | 4.374 |
| triethylene glycol | 112-27-6 | 13835895 | C(COCCOCCO)O | 0.497 | -0.416 | -0.149 | -3.292 | 3.484 |
| 1-chlorobutane | 109-69-3 | 7714 | CCCCCl | 0.315 | -0.642 | -2.847 | -4.684 | 4.534 |
| acetophenone | 98-86-2 | 7132 | CC(=O)C1=CC=CC=C1 | 0.343 | -0.199 | -0.751 | -4.753 | 4.179 |
| benzaldehyde | 100-52-7 | 235 | C1=CC=C(C=C1)C=O | 0.323 | -0.183 | -0.802 | -4.569 | 4.152 |
| 2-butanone | 78-93-3 | 6321 | CCC(=O)C | 0.229 | -0.128 | -0.536 | -4.814 | 4.208 |
| n-butyl acetate | 123-86-4 | 29012 | CCCCOC(=O)C | 0.321 | -0.434 | -0.803 | -4.887 | 4.298 |
| di-n-butyl ether | 142-96-1 | 8569 | CCCCOCCCC | 0.396 | -0.874 | -1.009 | -5.155 | 4.548 |
| butyronitrile | 109-74-0 | 7717 | CCCC#N | 0.247 | -0.296 | -1.057 | -4.556 | 4.137 |
| cyclohexanone | 108-94-1 | 7679 | C1CCC(=O)CC1 | 0.279 | -0.004 | -0.793 | -4.779 | 4.227 |
| cyclopentanone | 120-92-3 | 8141 | C1CCC(=O)C1 | 0.298 | 0.031 | -0.563 | -4.765 | 4.164 |
| DEGDEE | 112-36-7 | 21106583 | CCOCCOCCOCC | 0.367 | -0.377 | -0.574 | -4.503 | 4.043 |
| DEGDME | 111-96-6 | 13839575 | COCCOCCOC | 0.38 | -0.514 | -0.417 | -4.324 | 3.922 |
| dibenzyl ether | 103-50-4 | 21105876 | C1=CC=C(C=C1)COCC2=CC=CC=C2 | 0.472 | -0.698 | -0.972 | -4.701 | 4.333 |
| diethyl carbonate | 105-58-8 | 7478 | CCOC(=O)OCC | 0.318 | -0.262 | -0.968 | -4.629 | 4.063 |
| diethyl ether | 60-29-7 | 3168 | CCOCC | 0.314 | -0.625 | -0.517 | -4.932 | 4.464 |
| di-isopropyl ether | 108-20-3 | 7626 | CC(C)OC(C)C | 0.284 | -0.472 | -0.639 | -4.838 | 4.488 |
| 1,2-dimethoxyethane | 110-71-4 | 13836589 | COCCOC | 0.36 | -0.547 | -0.379 | -4.524 | 4.038 |
| 3,3-dimethyl-2-butanone | 75-97-8 | 6176 | CC(=O)C(C)(C)C | 0.25 | -0.262 | -0.574 | -4.877 | 4.237 |
| 2,6-dimethyl-4-heptanone | 108-83-8 | 7670 | CC(C)CC(=O)CC(C)C | 0.346 | -0.593 | -0.718 | -4.826 | 4.247 |
| 2,4-dimethyl-3-pentanone | 565-80-0 | 10797 | CC(C)C(=O)C(C)C | 0.281 | -0.375 | -0.547 | -4.822 | 4.279 |
| 1,4-dioxane | 123-91-1 | 29015 | C1COCCO1 | 0.345 | -0.189 | -0.416 | -4.497 | 4.084 |
| ethyl acetate | 141-78-6 | 8525 | CCOC(=O)C | 0.262 | -0.298 | -0.7 | -4.868 | 4.252 |
| ethyl benzoate | 93-89-0 | 6897 | CCOC(=O)C1=CC=CC=C1 | 0.366 | -0.297 | -1.161 | -4.65 | 4.127 |
| ethyl formate | 109-94-4 | 7734 | CCOC=O | 0.227 | -0.116 | -0.805 | -4.483 | 3.778 |
| ethyl propionate | 105-37-3 | 7463 | CCC(=O)OCC | 0.249 | -0.284 | -0.646 | -4.873 | 4.236 |
| methyl acetate | 79-20-9 | 6335 | CC(=O)OC | 0.217 | -0.055 | -0.657 | -4.563 | 3.919 |
| methyl benzoate | 93-58-3 | 6883 | COC(=O)C1=CC=CC=C1 | 0.36 | -0.225 | -1.149 | -4.63 | 4.109 |
| 3-methyl-2-butanone | 563-80-4 | 10777 | CC(C)C(=O)C | 0.22 | -0.159 | -0.405 | -4.826 | 4.265 |
| 4-methyl-2-pentanone | 108-10-1 | 7621 | CC(C)CC(=O)C | 0.249 | -0.417 | -0.527 | -4.804 | 4.276 |
| 2-pentanone | 107-87-9 | 7607 | CCCC(=O)C | 0.275 | -0.4 | -0.511 | -4.887 | 4.244 |
| 3-pentanone | 96-22-0 | 7016 | CCC(=O)CC | 0.252 | -0.326 | -0.537 | -4.899 | 4.25 |
| pentyl acetate | 628-63-7 | 11843 | CCCCCOC(=O)C | 0.378 | -0.5 | -0.818 | -4.832 | 4.259 |
| di-n-propyl ether | 111-43-3 | 7823 | CCCOCCC | 0.358 | -0.727 | -0.66 | -5.091 | 4.54 |
| propyl formate | 110-74-7 | 7782 | CCCOC=O | 0.295 | -0.242 | -0.734 | -4.604 | 3.939 |
| TEGDME | 143-24-8 | 13835433 | COCCOCCOCCOCCOC | 0.407 | -0.552 | -0.505 | -4.105 | 3.644 |
| benzonitrile | 100-47-0 | 7224 | C1=CC=C(C=C1)C#N | 0.302 | -0.106 | -1.427 | -4.566 | 4.159 |
| chloroform | 67-66-3 | 5977 | C(Cl)(Cl)Cl | 0.203 | -0.405 | -3.075 | -4.029 | 4.504 |
| acetic anhydride | 108-24-7 | 7630 | CC(=O)OC(=O)C | 0.219 | -0.008 | -0.604 | -4.652 | 4.007 |
| acetonitrile | 75-05-8 | 6102 | CC#N | 0.119 | 0.132 | -1.151 | -4.25 | 3.656 |
| acetylacetone | 123-54-6 | 29001 | CC(=O)CC(=O)C | 0.233 | -0.057 | -0.279 | -4.703 | 4.029 |
| dimethylsulfoxide | 67-68-5 | 659 | CS(=O)C | 0.304 | 0.375 | 0.599 | -4.534 | 3.542 |
| N-methylacetamide | 79-16-3 | 6334 | CC(=O)NC | 0.139 | 0.021 | 0.605 | -4.574 | 4.031 |
| N-methylformamide | 123-39-7 | 28994 | CNC=O | 0.245 | -0.096 | 0.323 | -4.182 | 3.527 |
| methyl formate | 107-31-3 | 7577 | COC=O | 0.216 | 0.015 | -0.647 | -4.146 | 3.569 |
| propionitrile | 107-12-0 | 7566 | CCC#N | 0.233 | -0.144 | -1.049 | -4.459 | 3.949 |
| propylene carbonate | 108-32-7 | 7636 | CC1COC(=O)O1 | 0.223 | 0.302 | -1.03 | -4.388 | 3.577 |
| sulfolane | 126-33-0 | 29080 | C1CCS(=O)(=O)C1 | 0.238 | 0.398 | -0.405 | -4.607 | 3.567 |
| 2-butanol | 78-92-2 | 6320 | CCC(C)O | 0.294 | -0.972 | 0.186 | -3.85 | 4.155 |
| benzene | 71-43-2 | 236 | C1=CC=CC=C1 | 0.445 | -0.612 | -3.109 | -4.697 | 4.565 |
| bromobenzene | 108-86-1 | 7673 | C1=CC=C(C=C1)Br | 0.42 | -0.506 | -3.167 | -4.581 | 4.499 |
| 1-bromobutane | 109-65-9 | 7711 | CCCCBr | 0.34 | -0.688 | -2.993 | -4.626 | 4.549 |
| bromoethane | 74-96-4 | 6092 | CCBr | 0.361 | -0.614 | -2.995 | -4.606 | 4.569 |
| carbon disulfide | 75-15-0 | 6108 | C(=S)=S | 0.485 | -0.683 | -3.02 | -5.194 | 4.628 |
| carbon tetrachloride | 56-23-5 | 5730 | C(Cl)(Cl)(Cl)Cl | 0.404 | -0.782 | -3.224 | -4.305 | 4.558 |
| chlorobenzene | 108-90-7 | 7676 | C1=CC=C(C=C1)Cl | 0.385 | -0.483 | -3.087 | -4.617 | 4.575 |
| 3-methyl-1-butanol | 123-51-3 | 29000 | CC(C)CCO | 0.349 | -1.101 | 0.102 | -3.833 | 4.283 |
| 1-chloropropane | 540-54-5 | 10437 | CCCCl | 0.324 | -0.623 | -2.795 | -4.649 | 4.536 |
| 2-chloropropane | 75-29-6 | 6121 | CC(C)Cl | 0.327 | -0.674 | -2.939 | -4.559 | 4.552 |
| cis-decaline | 91-17-8 | 10179239 | C1C[C@@H]2[C@@H](CCCC2)CC1 | 0.643 | -1.529 | -3.488 | -4.689 | 4.557 |
| cyclohexane | 110-82-7 | 7787 | C1CCCCC1 | 0.663 | -1.478 | -3.524 | -4.88 | 4.653 |
| cyclohexene | 110-83-8 | 7788 | C1CCC=CC1 | 0.475 | -0.893 | -3.132 | -4.741 | 4.642 |
| cyclopentane | 287-92-3 | 8896 | C1CCCC1 | 0.535 | -1.133 | -3.416 | -4.897 | 4.658 |
| n-decane | 124-18-5 | 14840 | CCCCCCCCCC | 0.667 | -1.682 | -3.427 | -4.988 | 4.592 |
| o-dichlorobenzene | 95-50-1 | 13837988 | C1=CC=C(C(=C1)Cl)Cl | 0.412 | -0.539 | -3.075 | -4.593 | 4.448 |
| 1,1-dichloroethylene | 75-35-4 | 13835316 | C=C(Cl)Cl | 0.346 | -0.554 | -3.143 | -4.492 | 4.55 |
| m-dichlorobenzene | 541-73-1 | 13857694 | C1=CC(=CC(=C1)Cl)Cl | 0.391 | -0.581 | -3.061 | -4.588 | 4.459 |
| N,N-dimethylaniline | 121-69-7 | 924 | CN(C)C1=CC=CC=C1 | 0.39 | -0.437 | -1.318 | -4.757 | 4.433 |
| diphenyl ether | 101-84-8 | 7302 | C1=CC=C(C=C1)OC2=CC=CC=C2 | 0.418 | -0.644 | -1.778 | -4.664 | 4.372 |
| fluorobenzene | 462-06-6 | 9614 | C1=CC=C(C=C1)F | 0.288 | -0.372 | -2.385 | -4.599 | 4.506 |
| n-heptane | 142-82-5 | 8560 | CCCCCCC | 0.609 | -1.673 | -3.457 | -4.98 | 4.633 |
| n-hexane | 110-54-3 | 7767 | CCCCCC | 0.539 | -1.618 | -3.423 | -5.001 | 4.643 |
| iodobenzene | 591-50-4 | 11087 | C1=CC=C(C=C1)I | 0.406 | -0.659 | -3.217 | -4.572 | 4.509 |
| iodoethane | 75-03-6 | 6100 | CCI | 0.433 | -0.727 | -3.147 | -4.685 | 4.582 |
| mesitylene | 108-67-8 | 7659 | CC1=CC(=CC(=C1)C)C | 0.494 | -1.032 | -3.149 | -4.91 | 4.567 |
| iso-octane | 540-84-1 | 10445 | CC(C)CC(C)(C)C | 0.509 | -1.53 | -3.446 | -4.837 | 4.585 |
| n-octane | 111-65-9 | 349 | CCCCCCCC | 0.666 | -1.708 | -3.474 | -4.961 | 4.621 |
| n-pentane | 109-66-0 | 7712 | CCCCC | 0.412 | -1.414 | -3.316 | -5.057 | 4.639 |
| phenetole | 103-73-1 | 7391 | CCOC1=CC=CC=C1 | 0.354 | -0.424 | -1.636 | -4.8 | 4.491 |
| styrene | 100-42-5 | 7220 | C=CC1=CC=CC=C1 | 0.472 | -0.732 | -3.147 | -4.73 | 4.57 |
| tetrachloroethylene | 127-18-4 | 13837281 | C(=C(Cl)Cl)(Cl)Cl | 0.354 | -0.482 | -3.045 | -4.235 | 4.348 |
| toluene | 108-88-3 | 1108 | CC1=CC=CC=C1 | 0.429 | -0.609 | -2.979 | -4.751 | 4.588 |
| 1,1,1-trichloroethane | 71-55-6 | 6042 | CC(Cl)(Cl)Cl | 0.321 | -0.636 | -3.192 | -4.384 | 4.549 |
| trichloroethylene | 79-01-6 | 13837280 | C(=C(Cl)Cl)Cl | 0.292 | -0.344 | -2.959 | -4.395 | 4.347 |
| m-xylene | 108-38-3 | 7641 | CC1=CC(=CC=C1)C | 0.408 | -0.689 | -2.945 | -4.919 | 4.594 |
| o-xylene | 95-47-6 | 6967 | CC1=CC=CC=C1C | 0.473 | -0.747 | -2.932 | -4.867 | 4.595 |
| p-xylene | 106-42-3 | 7521 | CC1=CC=C(C=C1)C | 0.455 | -0.747 | -2.907 | -4.89 | 4.597 |
| aniline | 62-53-3 | 5889 | C1=CC=C(C=C1)N | 0.351 | -0.461 | -1.055 | -4.481 | 4.324 |
| anisole | 100-66-3 | 7238 | COC1=CC=CC=C1 | 0.356 | -0.344 | -1.516 | -4.723 | 4.501 |
| 1,1-dichloroethane | 75-34-3 | 6125 | CC(Cl)Cl | 0.282 | -0.416 | -2.988 | -4.365 | 4.497 |
| 1,2-dichloroethane | 107-06-2 | 13837650 | C(CCl)Cl | 0.299 | -0.206 | -2.757 | -4.439 | 4.352 |
| Z-1,2-dichloroethylene | 156-59-2 | 558928 | C(=C\\Cl)\\Cl | 0.327 | -0.387 | -2.884 | -4.475 | 4.465 |
| dichloromethane | 75-09-2 | 6104 | C(Cl)Cl | 0.159 | -0.261 | -2.882 | -4.372 | 4.516 |
| nitrobenzene | 98-95-3 | 7138 | C1=CC=C(C=C1)[N+](=O)[O-] | 0.443 | -0.119 | -1.974 | -4.527 | 4.176 |
| nitroethane | 79-24-3 | 6338 | CC[N+](=O)[O-] | 0.243 | -0.192 | -1.409 | -4.449 | 3.797 |
| nitromethane | 75-52-5 | 6135 | C[N+](=O)[O-] | 0.103 | 0.228 | -1.477 | -4.079 | 3.534 |
| 1,1,2,2-tetrachloroethane | 79-34-5 | 6342 | C(C(Cl)Cl)(Cl)Cl | 0.328 | -0.504 | -3.026 | -4.058 | 4.325 |
| 1-butanol | 71-36-3 | 258 | CCCCO | 0.396 | -0.986 | 0.145 | -3.914 | 4.127 |
| benzyl alcohol | 100-51-6 | 13860335 | C1=CC=C(C=C1)CO | 0.365 | -0.399 | -0.381 | -3.949 | 4.143 |
| tert-butyl alcohol | 75-65-0 | 6146 | CC(C)(C)O | 0.27 | -0.889 | 0.228 | -3.979 | 4.197 |
| cyclohexanol | 108-93-0 | 7678 | C1CCC(CC1)O | 0.369 | -0.726 | -0.267 | -4.035 | 4.224 |
| ethanol | 64-17-5 | 682 | CCO | 0.378 | -0.906 | 0.337 | -3.574 | 3.819 |
| 1-hexanol | 111-27-3 | 7812 | CCCCCCO | 0.446 | -1.065 | 0.069 | -3.98 | 4.199 |
| isobutyl alcohol | 78-83-1 | 6312 | CC(C)CO | 0.327 | -0.997 | 0.125 | -3.734 | 4.121 |
| 2-methyl-2-butanol | 75-85-4 | 6165 | CCC(C)(C)O | 0.285 | -0.925 | 0.042 | -3.941 | 4.29 |
| tetraethylene glycol | 112-60-7 | 7908 | C(COCCOCCOCCO)O | 0.49 | -0.423 | -0.31 | -3.297 | 3.495 |
| 1-pentanol | 71-41-0 | 6040 | CCCCCO | 0.448 | -1.071 | 0.13 | -3.919 | 4.162 |
| 2-pentanol | 6032-29-7 | 21011 | CCCC(C)O | 0.399 | -1.147 | 0.152 | -3.847 | 4.22 |
| 3-pentanol | 584-02-1 | 10947 | CCC(CC)O | 0.355 | -1.044 | 0.108 | -3.93 | 4.23 |
| 1-propanol | 71-23-8 | 1004 | CCCO | 0.378 | -0.953 | 0.276 | -3.795 | 4.038 |
| 2-propanol | 67-63-0 | 3644 | CC(C)O | 0.332 | -0.956 | 0.341 | -3.781 | 4.079 |
| 1-octanol | 111-87-5 | 932 | CCCCCCCCO | 0.461 | -1.066 | -0.057 | -4.141 | 4.204 |
| 2-aminoethanol | 141-43-5 | 13835336 | C(CO)N | 0.432 | -0.5 | 0.351 | -3.503 | 3.353 |
| ethylene glycol | 107-21-1 | 13835235 | C(CO)O | 0.493 | -0.451 | 0.375 | -2.968 | 2.936 |
| diethylene glycol | 111-46-6 | 13835180 | C(COCCO)O | 0.48 | -0.421 | -0.062 | -3.361 | 3.478 |
| trimethylene glycol | 504-63-2 | 13839553 | C(CO)CO | 0.434 | -0.627 | 0.236 | -3.726 | 3.6 |
| glycerol | 56-81-5 | 733 | C(C(CO)O)O | 0.405 | -0.43 | 0.076 | -3.421 | 3.476 |
| propylene glycol | 57-55-6 | 13835224 | CC(CO)O | 0.387 | -0.447 | 0.259 | -3.447 | 3.586 |
| methanol | 67-56-1 | 864 | CO | 0.286 | -0.631 | 0.312 | -3.417 | 3.572 |
| 2-methoxyethanol | 109-86-4 | 7728 | COCCO | 0.385 | -0.472 | 0.081 | -3.497 | 3.659 |
| furfuryl alcohol | 98-00-0 | 7083 | C1=COC(=C1)CO | 0.351 | -0.195 | -0.047 | -3.553 | 3.88 |
| acetic acid | 64-19-7 | 171 | CC(=O)O | 0.193 | 0.016 | -0.103 | -3.625 | 3.565 |
| m-cresol | 108-39-4 | 21105871 | CC1=CC(=CC=C1)O | 0.377 | -0.591 | -1.325 | -4.069 | 4.301 |
| phenol | 108-95-2 | 971 | C1=CC=C(C=C1)O | 0.342 | -0.551 | -1.337 | -4.003 | 4.324 |
| trifluoroacetic acid | 76-05-1 | 10239201 | C(=O)(C(F)(F)F)O | 0.127 | 0.009 | -0.675 | -2.89 | 3.397 |
| 2,2,2-trifluoroethanol | 75-89-8 | 21106169 | C(C(F)(F)F)O | 0.032 | -0.46 | -0.768 | -2.499 | 3.504 |
| formamide | 75-12-7 | 693 | C(=O)N | 0.138 | 0.23 | 0.437 | -3.584 | 2.893 |
| water | 7732-18-5 | 937 | O | 0.305 | -0.832 | -1.252 | -3.528 | 3.762 |
Predicted Coefficients e_0, s_0, a_0, b_0, and v_0 for green solvents
To explore the chemical space, each coefficient was multiplied by the average value of the corresponding solute descriptor (E=0.884, S=0.996, A=0.174, B=0.487, V=1.299) from ADModel003. PCA was then computed for (eE, sS, aA, bB, vV) using the following code:
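The original PCA code is not preserved on this backup page. The sketch below shows one way the described calculation could be done in base R with `prcomp`, using six solvents from the table of known coefficients as a small stand-in for the full data set. Centering without scaling is an assumption; the settings actually used are not stated here.

```r
# c = 0 coefficients (e_0, s_0, a_0, b_0, v_0) for a few known solvents
coefs <- rbind(
  `1-butanol` = c(0.387596, -0.97209,   0.108258, -3.97885,  4.12895),
  acetone     = c(0.286706, -0.04747,  -0.508846, -4.792269, 4.102844),
  benzene     = c(0.452175, -0.554143, -2.963555, -4.643338, 4.564318),
  DMSO        = c(0.341749,  0.745718,  1.200253, -4.516908, 3.262081),
  hexane      = c(0.53342,  -1.631725, -3.473425, -4.980698, 4.634317),
  methanol    = c(0.311909, -0.649107,  0.329542, -3.354582, 3.690751)
)
colnames(coefs) <- c("e", "s", "a", "b", "v")

# Average solute descriptor values quoted in the text
avg_desc <- c(E = 0.884, S = 0.996, A = 0.174, B = 0.487, V = 1.299)

# Multiply each coefficient column by its average descriptor
weighted <- sweep(coefs, 2, avg_desc, `*`)
colnames(weighted) <- c("eE", "sS", "aA", "bB", "vV")

# Principal component analysis (assumed: centered, not rescaled)
pca <- prcomp(weighted, center = TRUE, scale. = FALSE)
summary(pca)
pca$x[, 1:2]  # scores on the first two components, e.g. for plotting
```

The first two columns of `pca$x` are the kind of coordinates that could be exported for a two-dimensional chemical space plot.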
The following image was generated using Tableau Public v8.2. Solvents with known (measured) coefficients are squares and solvents with only predicted coefficients are circles. Green solvents are coloured green.
Many of the green solvents around methyl palmitate appear to occupy a new region of the chemical space not covered by current solvents with measured coefficients. Similarly, ethyl lactate and acetic acid may be useful novel solvents whose solvation properties differ from those of solvents with measured coefficients. On the other hand, there are few green alternatives for classical solvents around p-xylene.
Alternative Solvents
We can use chemical space information to make general green solvent recommendations (as is done with GSK and others). For example the predicted solvent coefficients for propylene glycol (green) are 'close' to the experimental values for methanol.
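| solvent | CAS | CSID | SMILES | e_0 | s_0 | a_0 | b_0 | v_0 |
|---|---|---|---|---|---|---|---|---|
| propylene glycol | 57-55-6 | 13835224 | CC(CO)O | 0.387 | -0.447 | 0.259 | -3.447 | 3.586 |
| methanol | 67-56-1 | 864 | CO | 0.286 | -0.631 | 0.312 | -3.417 | 3.572 |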
This suggests that propylene glycol can be used as a general substitute for methanol. To examine this, we compare the solubility values of compounds that have values in both solvents. We do not expect exact agreement (the coefficients are predicted, and they are not identical), but the solubility values are of the same order in most cases. The biggest discrepancy is for dimethyl fumarate, whose measured solubility is reported as 0.182 M in methanol and 0.005 M in propylene glycol. This is not something one would necessarily expect, and one or both of the reported values may be incorrect. Substituting propylene glycol for methanol is just one of many recommendations that can be made using this technique.
Similarly, by measuring the difference between two solvent logP values (and using the compound-specific solute descriptors E,S,A,B,V) as follows:
we get a good idea of pairs of solvents with similar solvation properties for specific individual compounds (the closer to zero, the better). By comparing classical (non-green) solvents with all green solvents, we can recommend green replacements for non-green solvents. As an example, we calculate recommended green solvent replacements for benzoic acid, see table below, using descriptors E=0.730, S=0.90, A=0.59, B=0.40, V=0.9317 (using d=0.01).[3] The recommendations make sense in general, and several examples can be explicitly verified by comparing actual measured solubility values.[4] Such a procedure can easily be carried out for other specific compounds with known or predicted Abraham descriptors to find alternative green solvents in varying specific circumstances (solubility, partition, etc.). Note the solvents at the bottom with no green alternatives.
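The exact difference formula is not preserved on this backup page. As a hedged sketch, the code below assumes the measure is the absolute difference between the two zero-intercept log P predictions for the same solute, and that d=0.01 acts as a cutoff on that difference. The function name `logP_difference` is hypothetical; the inputs are the c = 0 methanol coefficients and the predicted propylene glycol coefficients listed on this page.

```r
# Assumed measure: |log P(solvent 1) - log P(solvent 2)| for one solute,
# with both log P values from the zero-intercept Abraham equation.
# Coefficient vectors are ordered (e, s, a, b, v); descriptors (E, S, A, B, V).
logP_difference <- function(coefs1, coefs2, descriptors) {
  abs(sum((coefs1 - coefs2) * descriptors))
}

# Benzoic acid descriptors as quoted in the text
benzoic_acid <- c(0.730, 0.90, 0.59, 0.40, 0.9317)

# Methanol: c = 0 coefficients from the table of known solvents
methanol <- c(0.311909, -0.649107, 0.329542, -3.354582, 3.690751)

# Propylene glycol: predicted coefficients
propylene_glycol <- c(0.387, -0.447, 0.259, -3.447, 3.586)

d <- logP_difference(methanol, propylene_glycol, benzoic_acid)
is_candidate <- d <= 0.01  # assumed reading of the d = 0.01 cutoff
print(round(d, 4))
```

Under these assumptions, looping the same calculation over every classic and green solvent pair would give a replacement table of the kind shown below.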
classic solvent
green
green
green
green
green
1-propanol
Glycerol-1-ethyl monoether
Glycerol-2-ethyl monoether
1,3-Dioxolane-4-methanol
glycerol
Cyclopentyl methyl ether
Glycerol-1,3-dimethyl ether
2-Furfuraldehyde
acetylacetone
Glycerol-1-ethyl monoether
Glycerol-2-ethyl monoether
Caprylic acid diethanolamide
glycerol
Cyclopentyl methyl ether
Glycerol-1,3-dimethyl ether
2-Furfuraldehyde
3-methyl-2-butanone
3-methyl-1-butanol
Glycerol-1-ethyl monoether
Glycerol-2-ethyl monoether
Caprylic acid diethanolamide
glycerol
Cyclopentyl methyl ether
2-Furfuraldehyde
2-methyl-2-propanol
3-methyl-1-butanol
Glycerol-1-ethyl monoether
Glycerol-2-ethyl monoether
Caprylic acid diethanolamide
glycerol
Cyclopentyl methyl ether
2-Furfuraldehyde
triethylene glycol
Glycerol-1-ethyl monoether
Glycerol-2-ethyl monoether
1,3-Dioxolane-4-methanol
glycerol
Cyclopentyl methyl ether
Glycerol-1,3-dimethyl ether
diethylene glycol
Glycerol-1-ethyl monoether
Glycerol-2-ethyl monoether
1,3-Dioxolane-4-methanol
glycerol
Cyclopentyl methyl ether
Glycerol-1,3-dimethyl ether
nitroethane
Ethyl laurate
Dibutyl sebacate
Butyl myristate
Butyl palmitate
Methyl laurate
Butyl stearate
methanol
ethanol
Solketal
Glycerol-1,3-diethyl ether
Glycerol-1,2-diethyl ether
Propionic acid
cyclohexanone
gamma-Valerolactone
N,N-Dimethyloctanamide
3-Hydroxypropionic acid
trimethylene glycol
Glycerol-2-methyl monoether
3-pentanol
gamma-Valerolactone
N,N-Dimethyloctanamide
3-Hydroxypropionic acid
trimethylene glycol
Glycerol-2-methyl monoether
4-picoline
ethanol/water(80:20)vol
Glycerol-1,2-dimethyl ether
Glycofurol (n=2)
Glycerol-1-methyl monoether
Glycerol carbonate
acetophenone
1-octanol
ethyl acetate
Glycerol-1,2,3-trimethyl ether
PolyEthyleneGlycol 600
Dimethyl isosorbide
2-methyl-2-butanol
3-methyl-1-butanol
Caprylic acid diethanolamide
glycerol
Cyclopentyl methyl ether
2-Furfuraldehyde
DEGDME
butyl acetate
ethanol/water(60:40)vol
n-Propyl acetate
Isoamyle acetate
pyridine
gamma-Valerolactone
N,N-Dimethyloctanamide
trimethylene glycol
Glycerol-2-methyl monoether
2-pentanol
gamma-Valerolactone
N,N-Dimethyloctanamide
trimethylene glycol
Glycerol-2-methyl monoether
4-methyl-2-pentanone
1-decanol
Isobutyl acetate
Nopol
Glycerol-1,2-dibutyl ether
pyrrolidine
Geraniol
beta-Terpineol
Dimethyl 2-methylglutarate
EthylHexyllactate
2,6-dimethylpyridine
1-octanol
ethyl acetate
Glycerol-1,2,3-trimethyl ether
Dimethyl isosorbide
trifluoroethanol
Methyl ricinoleate
1,4-Cineol
Diethyl phthalate
Diisobutyl succinate
ethyl formate
Methyl ricinoleate
1,4-Cineol
Diethyl phthalate
Diisobutyl succinate
sulfolane
ethylene glycol
ethanol/water(70:30)vol
Glycerol-1,3-Dibutyl ether
2-Methyltetrahydrofuran
1-hexanol
Dimethyl succinate
N,N-Diethylolcapramide
3-Methoxy-3-methyl-1-butanol
2-Methyltetrahydrofuran
piperidine
Oleic acid
Menthanol
Tributyl citrate
formamide
1-decanol
Isobutyl acetate
Nopol
DEGDEE
1-decanol
Isobutyl acetate
Nopol
3,3-dimethyl-2-butanone
1-octanol
ethyl acetate
N,N-Dimethyldecanamide
ethyl benzoate
Diethyl adipate
Diethyl phthalate
Diisobutyl succinate
dibenzyl ether
Diethyl adipate
Diethyl phthalate
Diisobutyl succinate
propionitrile
Methyl ricinoleate
1,4-Cineol
Diisobutyl succinate
benzonitrile
Methyl ricinoleate
1,4-Cineol
Diisobutyl succinate
diethylamine
Glycerol-1,2,3-triethyl ether
Dihydromyrcenol
Cyclademol
di-n-propyl ether
Glycerol-1,2,3-triethyl ether
Dihydromyrcenol
Cyclademol
quinoline
ethanol/water(60:40)vol
Geraniol
alpha-Terpineol
diethyl ether
Geraniol
EthylHexyllactate
alpha-Terpineol
1-pentanol
1-butanol
PolyEthyleneGlycol 200
1,3-Dioxolane
2,4-dimethylpyridine
Glycerol-1,3-Dibutyl ether
Triethyl citrate
pentyl acetate
Oleic acid
Tributyl citrate
TEGDME
propylene carbonate
octadecanol
2,4-dimethyl-3-pentanone
ethyl acetate
N,N-Dimethyldecanamide
diphenyl ether
Butyl laurate
Isopropyl palmitate
acetic anhydride
ethylene glycol
Glycerol-1,3-Dibutyl ether
2-methyl-1-propanol
ethylene glycol
Glycerol-1,3-Dibutyl ether
3-pentanone
methyl acetate
Glycerol-1,2-dibutyl ether
2-aminoethanol
Glycofurol (n=2)
Glycerol-1-methyl monoether
methyl formate
Glycerol triacetate
Dimethyl phthalate
aniline
Glycerol triacetate
Dimethyl phthalate
2,4,6-trimethylpyridine
Glycerol triacetate
Dimethyl phthalate
phenetole
Decamethylcyclo-pentasiloxane
Diisoamylsuccinate
methyl tert-butyl ether
Dimethyl glutarate
Diethyl succinate
methyl benzoate
1,8-Cineol
Diethyl glutarate
diethyl carbonate
1,8-Cineol
Diethyl glutarate
anisole
1,8-Cineol
Diethyl glutarate
1,2-dimethoxyethane
ethanol/water(60:40)vol
alpha-Terpineol
N-formylmorpholine
benzyl alcohol
acetic acid
2,6-dimethyl-4-heptanone
Methyl ricinoleate
1,4-Cineol
di-isopropyl ether
Triethyl citrate
1-heptanol
Triethyl citrate
nitromethane
Ricinoleic acid
n-propylamine
Ricinoleic acid
carbon disulfide
p-Cymene
propyl formate
Menthanol
phenol
Menthanol
1,1-dichloroethane
isopropyl myristate
undecane
Isododecane
nonane
Isododecane
heptane
Isododecane
ethyl propionate
Isobutyl acetate
3-picoline
Glycerol-1-methyl monoether
morpholine
ethanol/water(90:10)vol
toluene
ethanol/water(20:80)vol
tetrachloroethylene
ethanol/water(20:80)vol
bromobenzene
ethanol/water(20:80)vol
triethylamine
Diisobutyl adipate
ethylenediamine
Diisobutyl adipate
1,9-decadiene
beta-Myrcene
cyclohexanol
acetone
butanone
acetone
2-butanol
acetone
N,N-dimethylaniline
1,8-Cineol
butyronitrile
1,4-Cineol
Z-1,2-dichloroethylene
trifluoroacetic acid
trichloroethylene
tributylamine
tributyl phosphate
THF
styrene
pentane
p-xylene
octane
o-xylene
o-dichlorobenzene
nitrobenzene
N,N-dimethylacetamide
N-methylpyrrolidinone
N-methylformamide
N-methylacetamide
N-methyl-2-piperidone
N-ethylformamide
N-ethylacetamide
n-butylamine
methylcyclohexane
mesitylene
m-xylene
m-dichlorobenzene
m-cresol
iodoethane
iodobenzene
HMPTA
hexane
hexadecane
fluorobenzene
ethylbenzene
DPMU
dodecane
DMF
DMEU
dimethylacetamide
diethylacetamide
dichloromethane
dibutylformamide
dibutyl ether
decane
cyclopentanone
cyclopentane
cyclohexene
cis-decaline
chloroform
chlorobenzene
carbon tetrachloride
bromoethane
benzene
benzaldehyde
acetonitrile
2,2,4-trimethylpentane
2-pentanone
2-methoxyethanol
2-chloropropane
1,4-dioxane
1,2-dichloroethane
1,1,3,3-tetramethyl urea
1,1,2,2-tetrachloroethane
1,1,1-trichloroethane
1,1-dichloroethylene
1-hexadecene
1-chloropropane
1-chlorobutane
1-bromobutane
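The pairing behind the table above can be sketched in R as follows. This is only an illustrative sketch, not the code used to build the table: the solvent names and coefficient values are hypothetical placeholders, and reading "d=0.01" as a cutoff on |d| is an assumption. Only the benzoic acid descriptors are taken from the text. Because c is zero for every solvent, d between two solvents is the descriptor-weighted sum of the coefficient differences, (e_01-e_02)*E + (s_01-s_02)*S + (a_01-a_02)*A + (b_01-b_02)*B + (v_01-v_02)*V.

```r
# Benzoic acid solute descriptors quoted in the text.
solute <- c(E = 0.730, S = 0.90, A = 0.59, B = 0.40, V = 0.9317)

# HYPOTHETICAL zero-intercept coefficients (e_0, s_0, a_0, b_0, v_0).
# Placeholder numbers for illustration only, not values from this page.
coef_names <- c("e", "s", "a", "b", "v")
classic <- rbind(
  "classic solvent X" = c(0.30, -0.60, -0.20, -4.00, 4.200),
  "classic solvent Y" = c(0.10, -1.50, -3.00, -4.50, 4.400)
)
green <- rbind(
  "green solvent P" = c(0.30, -0.60, -0.20, -4.00, 4.205),
  "green solvent Q" = c(0.50,  0.10,  0.90, -3.00, 3.800)
)
colnames(classic) <- coef_names
colnames(green) <- coef_names

# log P with c = 0: e_0*E + s_0*S + a_0*A + b_0*B + v_0*V for each solvent.
logp_classic <- drop(classic %*% solute)
logp_green <- drop(green %*% solute)

# d[i, j] = logP(classic solvent i) - logP(green solvent j).
d <- outer(logp_classic, logp_green, "-")

# Assumed reading of "d=0.01": keep green solvents with |d| <= 0.01,
# ordered from the closest match to the farthest.
cutoff <- 0.01
replacements <- lapply(rownames(d), function(solvent) {
  hits <- abs(d[solvent, ])
  hits <- hits[hits <= cutoff]
  names(sort(hits))
})
names(replacements) <- rownames(d)
print(replacements)
```

With real inputs, `classic` and `green` would hold the measured or predicted coefficients from the tables in the Results section, and classic solvents that come back with an empty list correspond to the entries at the bottom of the table with no green alternative.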
Conclusion
Models for Abraham solvent coefficients were used to produce predictions for 203 solvents with unknown solvent coefficients, 118 of which are "sustainable solvents." We then presented two techniques for choosing sustainable solvents as possible replacements for non-green solvents: first, in general, by proximity (and classification) within a chemical space created by principal component analysis; and second, specifically, for individual compounds with known (or predicted) Abraham descriptors. Benzoic acid was used as an example, and the predictions matched well with measurements.
References
[1] Laurianne Moity, Morgan Durand, Adrien Benazzouz, Christel Pierlot, Valérie Molinier, and Jean-Marie Aubry. Panorama of sustainable solvents using the COSMO-RS approach. Green Chem., 2012, 14, 1132-1145, doi: 10.1039/C2GC16515E
[2] Paul C.M. van Noort. Solvation thermodynamics and the physical–chemical meaning of the constant in Abraham solvation equations. Chemosphere (2011), doi: 10.1016/j.chemosphere.2011.11.073
[3] Michael H. Abraham and William E. Acree Jr. The solubility of liquid and solid compounds in dry octan-1-ol. Chemosphere, Volume 103, May 2014, Pages 26–34, doi: 10.1016/j.chemosphere.2013.10.095
[4] Solubility of benzoic acid in organic solvents.